Evaluating Classifiers
Abstract
In a real-world application of supervised learning, we have a training set of examples with labels and a test set of examples with unknown labels; the whole point is to make predictions for the test examples. In research or experimentation, however, we want to measure the performance achieved by a learning algorithm. To do this we use a test set consisting of examples with known labels: we train the classifier on the training set, apply it to the test set, and then measure performance by comparing the predicted labels with the true labels (which were not available to the training algorithm).

Sometimes a training set and a test set are already given. Other times we have only a single database of labeled examples, and we must divide it ourselves into separate training and test subsets. A common rule of thumb is to use 70% of the database for training and 30% for testing. Dividing the database into training and test subsets is usually done randomly, in order to guarantee that both subsets are random samples from the same distribution. It can also be reasonable to do stratified sampling, which means ensuring that each class is present in (as nearly as possible) the same proportion in the training and test subsets.

It is absolutely vital to measure the performance of a classifier on an independent test set. Every training algorithm looks for patterns in the training data, i.e., correlations between the features and the class. Some of the patterns discovered may be spurious: they hold in the training data because of randomness in how the training data was selected from the population, but they are not valid, or not as strong, in the population as a whole. A classifier that relies on these spurious patterns will have higher accuracy on the training examples than it will on the whole population.
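As a concrete sketch of the procedure described above (a random, stratified 70/30 split followed by measuring accuracy on the held-out test set), the following uses scikit-learn; the library choice, the synthetic dataset, and the logistic-regression model are illustrative assumptions, not part of the original text:

```python
# Minimal sketch of the split-train-evaluate procedure, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# A synthetic labeled database standing in for the single database
# of labeled examples mentioned in the abstract.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Random 70/30 split; stratify=y keeps the class proportions
# (as nearly as possible) the same in the training and test subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Training accuracy is optimistically biased by any spurious patterns
# the algorithm found; only the test-set figure is a fair estimate of
# accuracy on the whole population.
print("train accuracy:", accuracy_score(y_train, clf.predict(X_train)))
print("test accuracy: ", accuracy_score(y_test, clf.predict(X_test)))
```

On typical runs the training accuracy exceeds the test accuracy, which is exactly the gap the abstract attributes to spurious patterns.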
Similar resources
Discrimination-Based Criteria for the Evaluation of Classifiers
Evaluating the performance of classifiers is a difficult task in machine learning. Many criteria have been proposed and used in such a process. Each criterion measures some facets of classifiers. However, none is good enough for all cases. In this communication, we justify the use of discrimination measures for evaluating classifiers. The justification is mainly based on a hierarchical model fo...
Classifier Performance Measures in Multi-Fault Diagnosis for Aircraft Engines
Classifier performance evaluation is an important step in designing diagnostic systems. Its purposes include: 1) selecting the best classifier from several candidates, 2) verifying that the designed classifier meets the design requirements, and 3) identifying the need for improvements in the classifier components. In order to effect...
Performance Evaluation of Machine Learning Classifiers in Sentiment Mining
In recent years, the use of machine learning classifiers has been of great value in solving a variety of problems in text classification. Sentiment mining is a kind of text classification in which messages are classified according to sentiment orientation, such as positive or negative. This paper extends the idea of evaluating the performance of various classifiers to show their effectiveness in sent...
Predicting the Quality of Semantic Relations by Applying Machine Learning Classifiers
In this paper, we propose the application of Machine Learning (ML) methods to the Semantic Web (SW) as a mechanism to predict the correctness of semantic relations. For this purpose, we have acquired a learning dataset from the SW and we have performed an extensive experimental evaluation covering more than 1,800 relations of various types. We have obtained encouraging results, reaching a maxim...
Learning Classifiers Using Hierarchically Structured Class Taxonomies
We consider classification problems in which the class labels are organized into an abstraction hierarchy in the form of a class taxonomy. We define a structured label classification problem. We explore two approaches for learning classifiers in such a setting. We also develop a class of performance measures for evaluating the resulting classifiers. We present preliminary results that demonstra...
Probabilistic Confusion Entropy for Evaluating Classifiers
For evaluating the classification model of an information system, a proper measure is usually needed to determine whether the model is appropriate for the specific domain task. Though many performance measures have been proposed, few have been defined specifically for multi-class problems, which tend to be more complicated than two-class problems, especially in addressing the issue of ...
Publication date: 2007